Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms

Abstract

In this paper, a new supervised clustering and classification method is proposed. First, discriminant partial least squares (DPLS) is applied to a gene expression microarray data set to select a minimum number of key genes. Second, supervised hierarchical clustering based on the cancer type information is proposed to find key gene groups and to group the cancer samples into different subclasses. Here, the weights of the genes in the DPLS are proportional to their importance in determining the class labels, that is, to the variable importance in the projection (VIP) information of the DPLS method. The power of the gene selection method and the proposed supervised hierarchical clustering method is illustrated on three microarray data sets: leukemia, breast cancer, and colon cancer. Supervised machine learning algorithms thus enable subtype classification of the three data sets solely on the basis of molecular-level monitoring. Compared to unsupervised clustering, the supervised method performed better at discriminating between cancer types and cancer subtypes for the leukemia data set. The performance of the proposed method, using only a limited set of informative genes, is demonstrated to be comparable to or better than results reported in the literature for the three data sets. Furthermore, the method was successful in predicting the outcome of medical treatment (success or failure) based on the microarray data, which could make the method an important tool for clinicians.
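
The abstract does not spell out how the VIP information is computed, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes a two-class problem coded as 0/1, fits PLS by the NIPALS algorithm, and ranks variables by the widely used VIP definition (squared weights averaged over components, weighted by the response variance each component explains). The function name `pls1_vip`, the synthetic data, and the choice of two components are all hypothetical.

```python
import numpy as np


def pls1_vip(X, y, n_components=2):
    """Fit single-response PLS by NIPALS and return one VIP score per variable."""
    Xr = X - X.mean(axis=0)   # centered predictors, deflated in the loop
    yr = y - y.mean()         # centered class labels, deflated in the loop
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))  # unit-norm weight vectors
    ss = np.zeros(n_components)           # response variance explained per component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)            # weight vector of component a
        t = Xr @ w                        # sample scores
        tt = t @ t
        p = Xr.T @ t / tt                 # predictor loadings
        q = (yr @ t) / tt                 # response loading
        Xr = Xr - np.outer(t, p)          # deflate predictors
        yr = yr - q * t                   # deflate response
        W[:, a] = w
        ss[a] = q**2 * tt
    # VIP_j = sqrt(n_vars * sum_a(ss_a * w_ja^2) / sum_a(ss_a))
    return np.sqrt(n_vars * (W**2 @ ss) / ss.sum())


# Synthetic stand-in for a microarray: 40 samples, 50 "genes",
# of which only the first three differ between the two classes.
rng = np.random.default_rng(0)
y = np.repeat([0.0, 1.0], 20)
X = rng.normal(size=(40, 50))
X[y == 1, :3] += 3.0

vip = pls1_vip(X, y, n_components=2)
top = np.argsort(vip)[::-1][:3]  # indices of the three highest-VIP variables
print("Highest-VIP variables:", sorted(top.tolist()))
```

Because each weight vector has unit norm, the squared VIP scores average to 1 across variables, which is why a threshold near 1 is often used to separate informative from uninformative variables. Whether the paper uses such a threshold is not stated in the abstract.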

Bibliographic record

  • Authors

    Yoo, C.; Gernaey, Krist

  • Author affiliation
  • Year 2008
  • Total pages
  • Original format PDF
  • Text language eng
  • CLC classification
